Search for: All records

Creators/Authors contains: "Zhou, Enlu"

Note: Clicking on a Digital Object Identifier (DOI) number takes you to an external site maintained by the publisher. Some full-text articles may not be available free of charge during the publisher's embargo period.

Some links on this page may take you to non-federal websites, whose policies may differ from those of this site.

  1. Theoretical Findings Validate Historical Data Reuse for Improved Policy Optimization A new study, “Reusing Historical Trajectories in Natural Policy Gradient via Importance Sampling: Convergence and Convergence Rate” by Yifan Lin, Yuhao Wang, and Enlu Zhou, studies how to improve policy optimization in reinforcement learning by reusing historical trajectories, reweighted through importance sampling, within natural policy gradient methods. The authors rigorously analyze the convergence of this approach and show that reusing past data improves the convergence rate while preserving theoretical guarantees. The findings have practical implications for applications where data collection is costly or limited, such as robotics and autonomous systems, and the resulting insights can be integrated directly into policy optimization frameworks. A minimal sketch of importance-weighted trajectory reuse appears after this list.
    Free, publicly-accessible full text available May 14, 2026
  2. In many real-world problems, we must select the best among a finite number of alternatives, where the best alternative depends on context-specific information. In this work, we study the contextual Ranking and Selection problem under a finite-alternative-finite-context setting, where we aim to find the best alternative for each context. We use a separate Gaussian process to model the reward for each alternative and derive the large deviations rate function for both the expected and worst-case contextual probability of correct selection. We propose the GP-C-OCBA sampling policy, which uses the Gaussian process posterior to iteratively allocate observations so as to maximize the rate function; a simplified sketch of such a rate-maximizing allocation appears after this list. We prove its consistency and show that it achieves the optimal convergence rate under a non-informative prior. Numerical experiments show that our algorithm is highly competitive in sampling efficiency while incurring significantly smaller computational overhead.
  3. We consider a simulation-based ranking and selection (R&S) problem with input uncertainty, in which unknown input distributions are estimated from input data arriving in batches of varying sizes over time. Each time a batch arrives, additional simulations can be run using the updated input distribution estimates. The goal is to confidently identify the best design after collecting as few batches as possible. We first introduce a moving average estimator for aggregating simulation outputs generated under heterogeneous input distributions. Then, based on a sequential elimination framework, we devise two major R&S procedures by establishing exact and asymptotic confidence bands for the estimator. We also extend our procedures to the indifference zone setting, which saves simulation effort in practical use. Numerical results show the effectiveness and necessity of our procedures in controlling error from input uncertainty. Moreover, efficiency can be further improved by optimizing the moving average estimator's “drop rate” parameter, i.e., the proportion of past simulation outputs to discard; a sketch of this drop-rate idea appears after this list.
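Sketch for item 1. The following is a minimal, illustrative sketch (not the authors' implementation) of reusing historical trajectories via importance sampling in a policy-gradient estimate. It assumes a tabular softmax policy, and the helper names (`log_prob`, `grad_log_prob`, `reused_policy_gradient`) are hypothetical; the natural policy gradient studied in the paper would additionally precondition this estimate with the inverse Fisher information matrix.

```python
import numpy as np

# Illustrative sketch: importance-sampling reuse of historical trajectories in
# a REINFORCE-style policy-gradient estimate for a tabular softmax policy.
# The paper's natural policy gradient would further precondition this estimate.

def log_prob(theta, s, a):
    """log pi_theta(a | s) for a tabular softmax policy, theta: (states, actions)."""
    logits = theta[s] - theta[s].max()            # subtract max for stability
    return logits[a] - np.log(np.exp(logits).sum())

def grad_log_prob(theta, s, a):
    """Score function d/dtheta log pi_theta(a | s); same shape as theta."""
    logits = theta[s] - theta[s].max()
    probs = np.exp(logits) / np.exp(logits).sum()
    g = np.zeros_like(theta)
    g[s] = -probs
    g[s, a] += 1.0
    return g

def reused_policy_gradient(theta_now, historical, gamma=0.99):
    """Gradient estimate that reuses trajectories generated under older
    parameters. `historical` is a list of (theta_old, trajectory) pairs,
    where a trajectory is a list of (state, action, reward) tuples."""
    grad = np.zeros_like(theta_now)
    for theta_old, traj in historical:
        # Likelihood ratio of the whole trajectory: current vs. behavior policy.
        log_w = sum(log_prob(theta_now, s, a) - log_prob(theta_old, s, a)
                    for s, a, _ in traj)
        ret = sum(gamma ** t * r for t, (_, _, r) in enumerate(traj))
        score = sum(grad_log_prob(theta_now, s, a) for s, a, _ in traj)
        grad += np.exp(log_w) * ret * score
    return grad / max(len(historical), 1)
```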
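Sketch for item 2. The sketch below illustrates the allocation principle under simplifying assumptions: independent normal observations with known variances for each (alternative, context) pair, rather than the Gaussian process model used by GP-C-OCBA. The function and variable names (`pairwise_rate`, `next_sample`, `mu`, `var`, `n`) are hypothetical; the next sample is greedily assigned so as to relax the bottleneck (smallest) pairwise misranking rate.

```python
import numpy as np

# Simplified sketch of a rate-function-guided allocation for contextual
# ranking and selection, assuming independent normal observations with known
# variances per (alternative, context) pair. It illustrates the
# "maximize the rate function" idea, not the GP posterior used by GP-C-OCBA.

def pairwise_rate(mu, var, n, c, i, b):
    """Large-deviations-style rate of misranking alternative i against the
    current best b in context c (normal approximation)."""
    gap = mu[b, c] - mu[i, c]
    return gap ** 2 / (2.0 * (var[b, c] / n[b, c] + var[i, c] / n[i, c]))

def next_sample(mu, var, n):
    """Greedily pick the next (alternative, context) pair to simulate by
    relaxing the bottleneck, i.e. the smallest pairwise rate."""
    num_alt, num_ctx = mu.shape
    best = mu.argmax(axis=0)                      # current best per context
    bottleneck = None                             # (rate, challenger, context)
    for c in range(num_ctx):
        for i in range(num_alt):
            if i == best[c]:
                continue
            r = pairwise_rate(mu, var, n, c, i, best[c])
            if bottleneck is None or r < bottleneck[0]:
                bottleneck = (r, i, c)
    _, i_star, c_star = bottleneck
    b = best[c_star]
    # Give the sample to whichever of {challenger, incumbent} raises the rate more.
    gains = []
    for j in (i_star, b):
        n_try = n.copy()
        n_try[j, c_star] += 1
        gains.append((pairwise_rate(mu, var, n_try, c_star, i_star, b), j))
    return max(gains)[1], c_star
```

Here `mu`, `var`, and `n` are (num_alternatives, num_contexts) arrays of posterior means, variances, and sample counts; in GP-C-OCBA these quantities would instead come from the Gaussian process posterior.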
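Sketch for item 3. The sketch below shows only the drop-rate idea, with hypothetical names: simulation outputs arrive in batches generated under successively refined input-distribution estimates, and the estimator averages only the most recent batches, discarding the oldest fraction. The exact estimator, confidence bands, and elimination procedures are developed in the paper.

```python
import numpy as np

# Illustrative sketch of a moving-average estimator with a "drop rate":
# the oldest fraction of batches (produced under the least accurate input
# estimates) is discarded before averaging the remaining simulation outputs.

def moving_average(batches, drop_rate=0.5):
    """Average outputs in the most recent (1 - drop_rate) fraction of batches.

    batches   : list of 1-D arrays, one per data batch.
    drop_rate : proportion of the oldest batches to discard.
    """
    k = len(batches)
    start = min(int(np.floor(drop_rate * k)), k - 1)   # keep at least one batch
    kept = np.concatenate(batches[start:])
    return kept.mean()

# Toy usage: later batches carry less input-estimation bias, so dropping
# early batches trades some variance for a reduction in that bias.
rng = np.random.default_rng(0)
batches = [rng.normal(loc=1.0 + 0.5 / (t + 1), size=20) for t in range(10)]
print(moving_average(batches, drop_rate=0.4))
```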